Transformer-based Named Entity Recognition in Construction Supply Chain Risk Management in Australia (2311.13755v1)
Abstract: The construction industry in Australia is characterised by intricate supply chains and vulnerability to myriad risks, making effective supply chain risk management (SCRM) imperative. This paper employs different transformer models, training them for Named Entity Recognition (NER) in the context of Australian construction SCRM. Using NER, the transformer models identify and classify specific risk-associated entities in news articles, offering detailed insight into supply chain vulnerabilities. By analysing news articles with these models, we extract relevant entities and insights related to risk taxonomies specific to the Australian construction landscape. This research emphasises the potential of NLP-driven solutions, such as transformer models, to revolutionise SCRM for construction in geo-media-specific contexts.
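As a rough illustration of the workflow described in the abstract, the sketch below runs a Hugging Face token-classification pipeline over a construction-related news sentence. The checkpoint name, the example sentence, and the generic entity labels are assumptions for illustration only; they are not the fine-tuned risk-taxonomy models the paper trains.

```python
# Minimal sketch (illustrative assumptions, not the paper's code) of applying a
# transformer-based NER pipeline to a construction-related news sentence.
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_name = "dslim/bert-base-NER"  # generic public NER checkpoint, used only for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# aggregation_strategy="simple" merges word-piece tokens into whole-entity spans
ner = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

sentence = ("Flooding in Queensland has delayed steel deliveries to several "
            "Brisbane construction sites.")

for entity in ner(sentence):
    # Prints generic labels (e.g. LOC -> Queensland); a model fine-tuned on a
    # construction risk taxonomy would instead emit labels such as
    # natural-hazard or material-shortage.
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```

In the paper's setting, the same pipeline pattern would be applied with transformer encoders fine-tuned on annotated Australian construction news, so the predicted entity classes correspond to the local risk taxonomy rather than generic person/location/organisation tags.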
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alexander, D., Vries, A.P.: " this research is funded by…": Named entity recognition of financial information in research papers (2021) Hillebrand et al. [2022] Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. [2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. 
Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. [2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. 
[2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. [2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. 
Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. 
arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. 
[2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. 
[2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al.
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. 
[2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. 
Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). 
IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. 
Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 
838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. 
Journal of Building Engineering 44, 103299 (2021)
Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional LSTM-CNNs. Transactions of the Association for Computational Linguistics 4, 357–370 (2016)
Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE
Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using LSTMs on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020)
Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using BERT-BiLSTM-CRF for Chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: BERT-based Chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on COVID-19 based on BERT. IEEE Access 10, 104156–104168 (2022)
Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of BERT-Transformer-CRF based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on BERT model. Expert Systems with Applications 206, 117727 (2022)
Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task BERT-BiLSTM-AM-CRF strategy for Chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using BERT. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested Arabic named entity corpus and recognition using BERT. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-train BERT. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of COVID-19 on the China-Australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on BERT model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on BERT. Earth and Space Science 9(3), 2021–002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in Neural Information Processing Systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: BERT algorithms explained. In: Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, pp. 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Google AI Blog. https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (2020). Accessed 4 July 2021
Cortiz [2022] Cortiz, D.: Exploring transformer models for emotion recognition: a comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. 
[2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. 
Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. 
Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. 
[2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). 
Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. 
[2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. 
Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). 
IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. 
Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. 
[2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. 
[2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. 
In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. 
In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. 
[2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). 
Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. 
[2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. 
Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). 
IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. 
Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Google AI Blog. https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (2020). Accessed 4 July 2021
Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. 
[2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. 
[2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. 
[2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. 
Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. 
Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. 
[2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Hirst, G., Hovy, E., Johnson, M.: Theory and Applications of Natural Language Processing. Springer (2013) Francis et al. [2019] Francis, S., Van Landeghem, J., Moens, M.-F.: Transfer learning for named entity recognition in financial and biomedical documents. Information 10(8), 248 (2019) Alexander and de Vries [2021] Alexander, D., Vries, A.P.: " this research is funded by…": Named entity recognition of financial information in research papers (2021) Hillebrand et al. [2022] Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. [2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. 
Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. 
[2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Francis, S., Van Landeghem, J., Moens, M.-F.: Transfer learning for named entity recognition in financial and biomedical documents. Information 10(8), 248 (2019) Alexander and de Vries [2021] Alexander, D., Vries, A.P.: " this research is funded by…": Named entity recognition of financial information in research papers (2021) Hillebrand et al. [2022] Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. [2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. 
[2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alexander, D., Vries, A.P.: " this research is funded by…": Named entity recognition of financial information in research papers (2021) Hillebrand et al. 
[2022] Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. [2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. 
[2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. 
[2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. 
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: A review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: BERT algorithms explained. In: Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, pp. 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: Tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html, accessed 4 July 2021 (2020)
Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: A comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. 
In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. 
Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. 
arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. 
Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. 
[2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. 
[2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. 
[2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. 
Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). 
IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. 
Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. 
[2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023)
Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE
Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020)
Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. IEEE Access 10, 104156–104168 (2022)
Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022)
Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “The end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [4 July 2021] (2020)
Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. 
[2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. 
Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. 
Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
Sanh et al.
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Hirst, G., Hovy, E., Johnson, M.: Theory and Applications of Natural Language Processing. Springer (2013) Francis et al. [2019] Francis, S., Van Landeghem, J., Moens, M.-F.: Transfer learning for named entity recognition in financial and biomedical documents. Information 10(8), 248 (2019) Alexander and de Vries [2021] Alexander, D., Vries, A.P.: " this research is funded by…": Named entity recognition of financial information in research papers (2021) Hillebrand et al. [2022] Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. [2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. 
Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. 
[2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Francis, S., Van Landeghem, J., Moens, M.-F.: Transfer learning for named entity recognition in financial and biomedical documents. Information 10(8), 248 (2019) Alexander and de Vries [2021] Alexander, D., Vries, A.P.: " this research is funded by…": Named entity recognition of financial information in research papers (2021) Hillebrand et al. [2022] Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. [2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. 
[2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alexander, D., Vries, A.P.: " this research is funded by…": Named entity recognition of financial information in research papers (2021) Hillebrand et al. 
[2022] Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. [2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. 
[2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. 
[2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. 
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. 
Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. 
[2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. 
[2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. 
[2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: BERT algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020)
Cortiz [2022] Cortiz, D.: Exploring transformer models for emotion recognition: a comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Koroteev [2021] Koroteev, M.: BERT: A review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021)
Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021)
Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional LSTM-CNNs. Transactions of the Association for Computational Linguistics 4, 357–370 (2016)
Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE
Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using LSTMs on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020)
Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using BERT BiLSTM CRF for Chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: BERT-based Chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on COVID-19 based on BERT. IEEE Access 10, 104156–104168 (2022)
Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of BERT-Transformer-CRF based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on BERT model. Expert Systems with Applications 206, 117727 (2022)
Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task BERT-BiLSTM-AM-CRF strategy for Chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using BERT. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested Arabic named entity corpus and recognition using BERT. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-trained BERT. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of COVID-19 on the China–Australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on BERT model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on BERT. Earth and Space Science 9(3), 2021–002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in Neural Information Processing Systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. 
[2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. 
[2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. 
In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Hillebrand, L., Deußer, T., Dilmaghani, T., Kliem, B., Loitz, R., Bauckhage, C., Sifa, R.: Kpi-bert: A joint named entity recognition and relation extraction model for financial reports. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 606–612 (2022). IEEE Śniegula et al. [2019] Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. 
[2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. 
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Śniegula, A., Poniszewska-Marańda, A., Chomątek, Ł.: Study of named entity recognition methods in biomedical field. Procedia Computer Science 160, 260–265 (2019) Perera et al. [2020] Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. 
[2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. 
In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. 
Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. 
International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. 
In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. 
In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. 
Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. 
arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). 
IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. 
In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. 
Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. 
arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. 
Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. 
arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. 
[2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. 
[2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. 
[2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. 
Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). 
IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. 
Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. 
[2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. 
[2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. 
Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. 
[2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. 
International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), e2021EA002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [4 July 2021] (2020)
Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020)
Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. IEEE Access 10, 104156–104168 (2022)
Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022)
Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Perera, N., Dehmer, M., Emmert-Streib, F.: Named entity recognition and relation detection for biomedical information extraction. Frontiers in cell and developmental biology, 673 (2020) Landolsi et al. [2022] Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. 
[2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. 
[2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. 
[2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. 
Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. 
In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. 
[2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021)
Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional LSTM-CNNs. Transactions of the Association for Computational Linguistics 4, 357–370 (2016)
Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE
Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using LSTMs on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020)
Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using BERT-BiLSTM-CRF for Chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: BERT-based Chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on COVID-19 based on BERT. IEEE Access 10, 104156–104168 (2022)
Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of BERT-Transformer-CRF based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on BERT model. Expert Systems with Applications 206, 117727 (2022)
Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task BERT-BiLSTM-AM-CRF strategy for Chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using BERT. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested Arabic named entity corpus and recognition using BERT. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-trained BERT. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of COVID-19 on the China–Australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on BERT model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on BERT. Earth and Space Science 9(3), e2021EA002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in Neural Information Processing Systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: BERT algorithms explained. In: Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, pp. 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Google AI Blog. https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (2020). Accessed 4 July 2021
Cortiz [2022] Cortiz, D.: Exploring transformer models for emotion recognition: a comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. 
Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. 
[2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. 
International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. 
[2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. 
[2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. 
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). 
IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. 
Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. 
[2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. 
[2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. 
[2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. 
International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. 
CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021)
- Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the Association for Computational Linguistics 4, 357–370 (2016)
- Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE
- Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
- Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
- Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
- Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
- Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. IEEE Access 10, 104156–104168 (2022)
- Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
- Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022)
- Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
- Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
- Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
- Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
- Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
- Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
- Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
- Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
- Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in Neural Information Processing Systems 13 (2000)
- Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
- Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
- Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "The end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
- Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
- Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (2020). Accessed 4 July 2021
- Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
- Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. 
In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 
838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. 
Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. 
[2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. 
arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Landolsi, M.Y., Romdhane, L.B., Hlaoua, L.: Medical named entity recognition using surrounding sequences matching. Procedia Computer Science 207, 674–683 (2022) Moon et al. [2021] Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Moon, S., Lee, G., Chi, S., Oh, H.: Automated construction specification review with named entity recognition using natural language processing. Journal of Construction Engineering and Management 147(1), 04020147 (2021) Zhang et al. [2023] Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. 
In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. 
Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. 
Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. 
arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. 
[2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. 
[2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. 
In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. 
In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. 
In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. 
[2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). 
Springer
Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-Markov conditional random fields for information extraction. Advances in Neural Information Processing Systems 17 (2004)
Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004)
Koroteev [2021] Koroteev, M.: BERT: A review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021)
Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021)
Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional LSTM-CNNs. Transactions of the Association for Computational Linguistics 4, 357–370 (2016)
Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE
Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using LSTMs on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020)
Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using BERT-BiLSTM-CRF for Chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: BERT-based Chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on COVID-19 based on BERT. IEEE Access 10, 104156–104168 (2022)
Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of BERT-Transformer-CRF based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on BERT model. Expert Systems with Applications 206, 117727 (2022)
Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task BERT-BiLSTM-AM-CRF strategy for Chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using BERT. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested Arabic named entity corpus and recognition using BERT. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-train BERT. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of COVID-19 on the China–Australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on BERT model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on BERT. Earth and Space Science 9(3), e2021EA002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in Neural Information Processing Systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: BERT algorithms explained. In: Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, pp. 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Google AI Blog. https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (2020). Accessed 4 July 2021
Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. 
In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. 
[2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. 
[2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. 
[2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. 
Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
Wang et al.
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Zhang, Q., Xue, C., Su, X., Zhou, P., Wang, X., Zhang, J.: Named entity recognition for chinese construction documents based on conditional random field. Frontiers of Engineering Management 10(2), 237–249 (2023) Jeon et al. [2022] Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. 
In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. 
Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. 
[2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. 
[2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. 
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. 
In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. 
Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. 
arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. 
[2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. 
Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. 
Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. 
CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. 
[2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. 
Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. 
arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (2020). Accessed 4 July 2021
Cortiz [2022] Cortiz, D.: Exploring transformer models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. 
com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. 
srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 
117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Jeon, K., Lee, G., Yang, S., Jeong, H.D.: Named entity recognition of building construction defect information from text with linguistic noise. Automation in Construction 143, 104543 (2022) Shaalan and Raza [2008] Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. 
In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. 
[2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. 
Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. 
Neural Processing Letters 55(2), 1209–1229 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using BERT. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested Arabic named entity corpus and recognition using BERT. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-train BERT. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of COVID-19 on the China–Australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on BERT model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on BERT. Earth and Space Science 9(3), e2021EA002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in Neural Information Processing Systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal and Agrawal [2021] Sabharwal, N., Agrawal, A.: BERT algorithms explained. In: Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, pp. 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Google AI Blog. https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html. Accessed 4 July 2021 (2020)
Cortiz [2022] Cortiz, D.: Exploring transformer models for emotion recognition: a comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. 
In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. 
[2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. 
[2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. 
Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. 
[2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Shaalan, K., Raza, H.: Arabic named entity recognition from diverse text types. In: Advances in Natural Language Processing: 6th International Conference, GoTAL 2008 Gothenburg, Sweden, August 25-27, 2008 Proceedings, pp. 440–451 (2008). Springer Alfred et al. [2014] Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. 
[2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014) Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. 
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. 
[2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. 
Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. 
Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. 
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020)
Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. 
Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. 
arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. 
[2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. 
Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. 
Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. 
CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. 
[2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. 
com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. 
srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 
117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Alfred, R., Leong, L.C., On, C.K., Anthony, P.: Malay named entity recognition based on rule-based approach (2014)
- Li et al. [2022] Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer
- Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004)
- Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004)
- Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021)
- Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021)
- Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the Association for Computational Linguistics 4, 357–370 (2016)
- Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE
- Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
- Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
- Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020)
- Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
- Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
- Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. IEEE Access 10, 104156–104168 (2022)
- Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
- Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022)
- Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
- Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
- Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
- Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
- Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
- Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
- Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
- Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
- Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
- Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
- Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
- Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
- Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
- Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
- Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020)
- Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma et al.
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. 
arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. 
In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. 
[2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. 
[2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. 
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. 
[2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al.
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. 
CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. 
Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. 
Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. 
[2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Li, X., Wang, T., Pang, Y., Han, J., Shi, J.: Review of research on named entity recognition. In: International Conference on Artificial Intelligence and Security, pp. 256–267 (2022). Springer Sarawagi and Cohen [2004] Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. 
[2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sarawagi, S., Cohen, W.W.: Semi-markov conditional random fields for information extraction. Advances in neural information processing systems 17 (2004) Wallach [2004] Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. 
In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wallach, H.M.: Conditional random fields: An introduction. Technical Reports (CIS), 22 (2004) Koroteev [2021] Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. 
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021) Abioye et al. [2021] Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. 
[2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. IEEE Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al.
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. 
[2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. 
[2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. 
In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. 
[2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. 
In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020)
Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020)
Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. 
arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. 
[2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. 
[2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. 
[2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021) Chiu and Nichols [2016] Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the association for computational linguistics 4, 357–370 (2016) Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. 
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. 
In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020)
Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020)
Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. IEEE Access 10, 104156–104168 (2022)
Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022)
Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal and Agrawal [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. 
com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. 
srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 
117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. 
arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. 
[2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. 
[2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. 
International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. 
CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Koroteev, M.: Bert: a review of applications in natural language processing and understanding. arXiv preprint arXiv:2103.11943 (2021)
- Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021)
- Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the Association for Computational Linguistics 4, 357–370 (2016)
- Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE
- Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
- Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
- Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
- Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
- Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. IEEE Access 10, 104156–104168 (2022)
- Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
- Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022)
- Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
- Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
- Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
- Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-trained bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
- Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
- Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
- Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
- Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
- Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in Neural Information Processing Systems 13 (2000)
- Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
- Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
- Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
- Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
- Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (2020) [accessed 4 July 2021]
- Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
- Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. 
Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. 
[2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 
3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. 
[2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. 
[2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. 
[2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Abioye, S.O., Oyedele, L.O., Akanbi, L., Ajayi, A., Delgado, J.M.D., Bilal, M., Akinade, O.O., Ahmed, A.: Artificial intelligence in the construction industry: A review of present status, opportunities and future challenges. Journal of Building Engineering 44, 103299 (2021)
- Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional LSTM-CNNs. Transactions of the Association for Computational Linguistics 4, 357–370 (2016)
- Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE
- Miwa, M., Bansal, M.: End-to-end relation extraction using LSTMs on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
- Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
- Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using BERT-BiLSTM-CRF for Chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
- Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: BERT-based Chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
- Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on COVID-19 based on BERT. IEEE Access 10, 104156–104168 (2022)
- Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of BERT-Transformer-CRF based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
- Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on BERT model. Expert Systems with Applications 206, 117727 (2022)
- Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task BERT-BiLSTM-AM-CRF strategy for Chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
- Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using BERT. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
- Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested Arabic named entity corpus and recognition using BERT. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
- Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-trained BERT. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
- Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of COVID-19 on the China-Australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
- Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on BERT model. Applied Sciences 12(15), 7708 (2022)
- Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
- Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on BERT. Earth and Space Science 9(3), 2021–002166 (2022)
- Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in Neural Information Processing Systems 13 (2000)
- Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
- Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
- Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "The end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal, N., Agrawal, A.: BERT algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
- Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
- Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: tutorial and survey (2020)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [4 July 2021] (2020)
- Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
- Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. 
Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). 
Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020)
Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using BERT BiLSTM CRF for Chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: BERT-based Chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on COVID-19 based on BERT. IEEE Access 10, 104156–104168 (2022)
Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of BERT-Transformer-CRF based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on BERT model. Expert Systems with Applications 206, 117727 (2022)
Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task BERT-BiLSTM-AM-CRF strategy for Chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using BERT. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested Arabic named entity corpus and recognition using BERT. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-train BERT. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of COVID-19 on the China-Australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
[2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. 
Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. 
Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. 
Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. 
In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
- Chiu, J.P., Nichols, E.: Named entity recognition with bidirectional lstm-cnns. Transactions of the Association for Computational Linguistics 4, 357–370 (2016)
- Wang et al. [2014] Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE
- Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
- Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
- Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020)
- Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
- Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
- Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. IEEE Access 10, 104156–104168 (2022)
- Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
- Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022)
- Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
- Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
- Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
- Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
- Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
- Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
- Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
- Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
- Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
- Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
- Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
- Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
[2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. 
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. 
[2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. 
[2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. 
In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020)
- Wang, S., Xu, R., Liu, B., Gui, L., Zhou, Y.: Financial named entity recognition based on conditional random fields and information entropy. In: 2014 International Conference on Machine Learning and Cybernetics, vol. 2, pp. 838–843 (2014). IEEE Miwa and Bansal [2016] Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. [2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Miwa, M., Bansal, M.: End-to-end relation extraction using lstms on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016) Peters et al. 
[2018] Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. [2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 Brown et al. 
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. 
[2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. 
In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Miwa, M., Bansal, M.: End-to-end relation extraction using LSTMs on sequences and tree structures. arXiv preprint arXiv:1601.00770 (2016)
- Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202
- Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using BERT BiLSTM CRF for Chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
- Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: BERT-based Chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
- Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on COVID-19 based on BERT. IEEE Access 10, 104156–104168 (2022)
- Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of BERT-Transformer-CRF based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
- Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on BERT model. Expert Systems with Applications 206, 117727 (2022)
- Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task BERT-BiLSTM-AM-CRF strategy for Chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
- Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using BERT. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
- Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested Arabic named entity corpus and recognition using BERT. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
- Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-trained BERT. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
- Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of COVID-19 on the China–Australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
- Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on BERT model. Applied Sciences 12(15), 7708 (2022)
- Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
- Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on BERT. Earth and Space Science 9(3), 2021–002166 (2022)
- Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in Neural Information Processing Systems 13 (2000)
- Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
- Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
- Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal, N., Agrawal, A.: BERT algorithms explained. In: Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, pp. 65–95 (2021)
- Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
- Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: tutorial and survey (2020)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Google AI Blog. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html, accessed 4 July 2021 (2020)
- Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
- Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2020] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in neural information processing systems 33, 1877–1901 (2020) Devlin et al. [2018] Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. 
[2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. 
[2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-train BERT. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of COVID-19 on the China-Australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on BERT model. Applied Sciences 12(15), 7708 (2022)
Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on BERT. Earth and Space Science 9(3), e2021EA002166 (2022)
Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. 
In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. 
[2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. 
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. 
Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. 
[2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
- Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Advances in Neural Information Processing Systems 33, 1877–1901 (2020)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
- Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using BERT BiLSTM CRF for Chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE
- Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: BERT-based Chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE
- Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on COVID-19 based on BERT. IEEE Access 10, 104156–104168 (2022)
- Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of BERT-Transformer-CRF based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
- Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on BERT model. Expert Systems with Applications 206, 117727 (2022)
- Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task BERT-BiLSTM-AM-CRF strategy for Chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
- Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using BERT. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
- Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested Arabic named entity corpus and recognition using BERT. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
- Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-train BERT. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
- Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of COVID-19 on the China-Australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
- Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on BERT model. Applied Sciences 12(15), 7708 (2022)
- Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
- Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on BERT. Earth and Space Science 9(3), 2021–002166 (2022)
- Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in Neural Information Processing Systems 13 (2000)
- Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
- Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
- Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal, N., Agrawal, A.: BERT algorithms explained. In: Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, pp. 65–95 (2021)
- Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
- Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: Tutorial and survey (2020)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020)
- Cortiz, D.: Exploring transformer models for emotion recognition: a comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
- Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. 
In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018) Dai et al. [2019] Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (cisp-bmei), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. 
Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. 
Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. 
CoRR abs/2003.10555 (2020) Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
- Dai, Z., Wang, X., Ni, P., Li, Y., Li, G., Bai, X.: Named entity recognition using bert bilstm crf for chinese electronic health records. In: 2019 12th International Congress on Image and Signal Processing, Biomedical Engineering and Informatics (CISP-BMEI), pp. 1–5 (2019). IEEE Yang et al. [2022a] Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. IEEE Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al.
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. 
Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. 
Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. 
In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Yang, T.-H., Pleva, M., Hládek, D., Su, M.-H.: Bert-based chinese medicine named entity recognition model applied to medication reminder dialogue system. In: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP), pp. 374–378 (2022). IEEE Yang et al. [2022b] Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. 
[2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. Ieee Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 
1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. 
[2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. 
arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. 
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. 
Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). 
IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
- Yang, C., Sheng, L., Wei, Z., Wang, W.: Chinese named entity recognition of epidemiological investigation of information on covid-19 based on bert. IEEE Access 10, 104156–104168 (2022) Chen et al. [2023] Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal and Agrawal [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer Yu et al. [2022] Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. 
[2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022) Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. 
Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
- Chen, D., Liu, C., Zhao, Z.: Named entity recognition service of bert-transformer-crf based on multi-feature fusion for chronic disease management. In: International Conference on Service Science, pp. 166–178 (2023). Springer
- Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022)
- Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
- Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
- Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
- Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
- Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
- Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
- Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
- Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
- Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
- Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
- Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
- Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III, pp. 677–693 (2021). Springer
- Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
- Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
- Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (accessed 4 July 2021) (2020)
- Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
- Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Sabharwal, N., Agrawal, A.: Bert algorithms explained.
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. 
Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. 
Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Yu, Y., Wang, Y., Mu, J., Li, W., Jiao, S., Wang, Z., Lv, P., Zhu, Y.: Chinese mineral named entity recognition based on bert model. Expert Systems with Applications 206, 117727 (2022)
- Tang et al. [2023] Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023)
- Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022)
- Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022)
- Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022)
- Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer
- Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022)
- Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020)
- Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
- Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
- Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
- Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
- Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
- Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
- Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
- Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [4 July 2021] (2020)
- Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
- Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. 
Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. 
[2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. 
Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Tang, X., Huang, Y., Xia, M., Long, C.: A multi-task bert-bilstm-am-crf strategy for chinese named entity recognition. Neural Processing Letters 55(2), 1209–1229 (2023) Gorla et al. [2022] Gorla, S., Tangeda, S.S., Neti, L.B.M., Malapati, A.: Telugu named entity recognition using bert. International Journal of Data Science and Analytics 14(2), 127–140 (2022) Jarrar et al. [2022] Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre-train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html, accessed 4 July 2021 (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 
117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. 
[2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Jarrar, M., Khalilia, M., Ghanem, S.: Wojood: Nested arabic named entity corpus and recognition using bert. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 3626–3636 (2022) Lun et al. [2022] Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al.
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [Accessed 4 July 2021] (2020) Cortiz [2022] Cortiz, D.: Exploring transformer models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature.
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter.
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. 
Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 
677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. 
CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Lun, Z., Hui, Z., et al.: Research on agricultural named entity recognition based on pre train bert. Academic Journal of Engineering and Technology Science 5(4), 34–42 (2022) Ndukwe et al. [2021] Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (accessed 4 July 2021) (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc.
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need.
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. 
In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? 
In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
- Ndukwe, C.V., Liu, J., Chan, T.K.: Impact of covid-19 on the china-australia construction supply chain. In: Proceedings of the 25th International Symposium on Advancement of Construction Management and Real Estate, pp. 1275–1291 (2021). Springer Huang et al. [2022] Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Johnson et al.
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. 
CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Huang, C., Wang, Y., Yu, Y., Hao, Y., Liu, Y., Zhao, X.: Chinese named entity recognition of geological news based on bert model. Applied Sciences 12(15), 7708 (2022) Qiu et al. [2020] Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. 
[2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. 
[2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. 
Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. 
CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Qiu, Q., Xie, Z., Wu, L., Tao, L.: Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques. Earth Science Informatics 13, 1393–1410 (2020) Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022) Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000) Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need.
Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013) Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. 
In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013) Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? 
In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Lv et al. [2022] Lv, X., Xie, Z., Xu, D., Jin, X., Ma, K., Tao, L., Qiu, Q., Pan, Y.: Chinese named entity recognition in the geoscience domain based on bert. Earth and Space Science 9(3), 2021–002166 (2022)
- Bengio et al. [2000] Bengio, Y., Ducharme, R., Vincent, P.: A neural probabilistic language model. Advances in neural information processing systems 13 (2000)
- Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
- Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
- Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
- Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
- Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
- Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [4 July 2021] (2020)
- Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
- Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. 
[2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. 
Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. 
Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 
677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. 
[2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. 
CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. 
Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. 
com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. 
arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Mikolov et al. [2013a] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
- Mikolov et al. [2013b] Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013)
- Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
- Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
- Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
- Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020)
- Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (accessed 4 July 2021) (2020)
- Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Raffel et al.
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 
117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26 (2013)
- Johnson et al. [2023] Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023)
- Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
- Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
- Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: BERT algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
- Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
- Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: tutorial and survey (2020)
- Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang and Li [2021] Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark and Luong [2020] Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020)
- Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
- Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. 
arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Johnson, S.J., Murty, M.R., Navakanth, I.: A detailed review on word embedding techniques with emphasis on word2vec. Multimedia Tools and Applications, 1–29 (2023) Almeida and Xexéo [2019] Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. 
[2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019) Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. 
In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 
538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). 
Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
- Almeida, F., Xexéo, G.: Word embeddings: A survey. arXiv preprint arXiv:1901.09069 (2019)
Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021)
Sabharwal et al. [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020)
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023) Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. 
In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. 
[2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). 
Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
- Asudani et al. [2023] Asudani, D.S., Nagwani, N.K., Singh, P.: Impact of word embedding models on text analytics in deep learning environment: a review. Artificial Intelligence Review, 1–81 (2023)
- Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
- Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023)
- Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: "the end of history" for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017) Turner [2023] Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. 
[2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. 
[2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
- Vaswani et al. [2017] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural information processing systems 30 (2017)
googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers:“the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. 
[2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal et al. [2021] Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sabharwal, N., Agrawal, A., Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. 
Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. 
arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. 
[2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 
117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Turner, R.E.: An introduction to transformers. arXiv preprint arXiv:2304.10557 (2023) Chernyavskiy et al. [2021] Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “the end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer Luitse and Denkena [2021] Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of ai. Big Data & Society 8(2), 20539517211047734 (2021) Sabharwal and Agrawal [2021] Sabharwal, N., Agrawal, A.: Bert algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021) Wang et al. [2020] Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Chernyavskiy, A., Ilvovsky, D., Nakov, P.: Transformers: “The end of history” for natural language processing? In: Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13–17, 2021, Proceedings, Part III 21, pp. 677–693 (2021). Springer
- Luitse, D., Denkena, W.: The great transformer: Examining the role of large language models in the political economy of AI. Big Data & Society 8(2), 20539517211047734 (2021)
- Sabharwal, N., Agrawal, A.: BERT algorithms explained. Hands-on Question Answering Systems with BERT: Applications in Neural Networks and Natural Language Processing, 65–95 (2021)
- Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in BERT. In: International Conference on Learning Representations (2020)
- Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, BERT, and GPT: Tutorial and survey (2020)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019)
- Sanh, V., Debut, L., Chaumond, J., Wolf, T.: DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
- Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: GPT-NER: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang, M., Li, J.: A commentary of GPT-3 in MIT Technology Review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang, E.F., De Meulder, F.: Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark, K., Luong, T.: More efficient NLP model pre-training with ELECTRA. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html (2020). Accessed 4 July 2021
- Cortiz, D.: Exploring transformers models for emotion recognition: A comparison of BERT, DistilBERT, RoBERTa, XLNet and ELECTRA. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on RoBERTa and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of BERT, ALBERT, and Longformer on DuoRC. arXiv preprint arXiv:2101.06326 (2021)
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. 
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 
117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. 
arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. 
[2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Wang, B., Shang, L., Lioma, C., Jiang, X., Yang, H., Liu, Q., Simonsen, J.G.: On position embeddings in bert. In: International Conference on Learning Representations (2020) Huangliang et al. [2023] Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer Ghojogh and Ghodsi [2020] Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. 
Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. 
[2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] 
(2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. 
[2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. 
[2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Huangliang, K., Li, X., Yin, T., Peng, B., Zhang, H.: Self-adapted positional encoding in the transformer encoder for named entity recognition. In: International Conference on Artificial Neural Networks, pp. 538–549 (2023). Springer
- Ghojogh, B., Ghodsi, A.: Attention mechanism, transformers, bert, and gpt: tutorial and survey (2020) Liu et al. [2019] Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. 
arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. 
[2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. 
In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. 
Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. 
[2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. 
arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., Stoyanov, V.: Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692 (2019) Sanh et al. [2019] Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. [2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019) Lan et al. 
[2019] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019) Clark et al. [2020] Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. 
CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020) 2003.10555 Raffel et al. [2019] Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019) 1910.10683 Wang et al. [2023] Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. 
Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023) Zhang and Li [2021] Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003) Clark and Luong [2020] Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Preuzeto s from https://ai. googleblog. com/2020/03/more-efficient-nlp-model-pre-training. html [4. srpnja 2021.] (2020) Cortiz [2022] Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021) Coburn et al. [2013] Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013) Sang and De Meulder [2003] Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. 
arXiv preprint cs/0306050 (2003)
- Sanh, V., Debut, L., Chaumond, J., Wolf, T.: Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108 (2019)
- Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R.: Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942 (2019)
- Clark, K., Luong, M., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. CoRR abs/2003.10555 (2020)
In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Cortiz, D.: Exploring transformers models for emotion recognition: a comparision of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022) Adoma et al. [2020] Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE Oh et al. [2022] Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. 
[2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022) Wu et al. [2021] Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021) Quijano et al. [2021] Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021) Foumani et al. [2023] Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023) Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)
- Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. CoRR abs/1910.10683 (2019)
- Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., Wang, G.: Gpt-ner: Named entity recognition via large language models. arXiv preprint arXiv:2304.10428 (2023)
- Zhang, M., Li, J.: A commentary of gpt-3 in mit technology review 2021. Fundamental Research 1(6), 831–833 (2021)
- Coburn, A., Ralph, D., Tuveson, M., Ruffle, S., Bowman, G.: A taxonomy of threats for macro-catastrophe risk management. Centre for Risk Studies, Cambridge: University of Cambridge, Working Paper, July, 20–24 (2013)
- Sang, E.F., De Meulder, F.: Introduction to the conll-2003 shared task: Language-independent named entity recognition. arXiv preprint cs/0306050 (2003)
- Clark, K., Luong, T.: More efficient nlp model pre-training with electra. Retrieved from https://ai.googleblog.com/2020/03/more-efficient-nlp-model-pre-training.html [accessed 4 July 2021] (2020)
- Cortiz, D.: Exploring transformers models for emotion recognition: a comparison of bert, distilbert, roberta, xlnet and electra. In: Proceedings of the 2022 3rd International Conference on Control, Robotics and Intelligent System, pp. 230–234 (2022)
- Adoma, A.F., Henry, N.-M., Chen, W.: Comparative analyses of bert, roberta, distilbert, and xlnet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), pp. 117–121 (2020). IEEE
- Oh, S.H., Kang, M., Lee, Y.: Protected health information recognition by fine-tuning a pre-training transformer model. Healthcare Informatics Research 28(1), 16–24 (2022)
- Wu, Y., Huang, J., Xu, C., Zheng, H., Zhang, L., Wan, J.: Research on named entity recognition of electronic medical records based on roberta and radical-level feature. Wireless Communications and Mobile Computing 2021, 1–10 (2021)
- Quijano, A.J., Nguyen, S., Ordonez, J.: Grid search hyperparameter benchmarking of bert, albert, and longformer on duorc. arXiv preprint arXiv:2101.06326 (2021)
- Foumani, N.M., Tan, C.W., Webb, G.I., Salehi, M.: Improving position encoding of transformers for multivariate time series classification. arXiv preprint arXiv:2305.16642 (2023)